Compression complexity with ordinal patterns for robust causal inference in irregularly sampled time series

Distinguishing cause from effect is a scientific challenge resisting solutions from mathematics, statistics, information theory and computer science. Compression-Complexity Causality (CCC) is a recently proposed interventional measure of causality, inspired by Wiener–Granger’s idea. It estimates causality based on change in dynamical compression-complexity (or compressibility) of the effect variable, given the cause variable. CCC works with minimal assumptions on given data and is robust to irregular-sampling, missing-data and finite-length effects. However, it only works for one-dimensional time series. We propose an ordinal pattern symbolization scheme to encode multidimensional patterns into one-dimensional symbolic sequences, and thus introduce the Permutation CCC (PCCC). We demonstrate that PCCC retains all advantages of the original CCC and can be applied to data from multidimensional systems with potentially unobserved variables which can be reconstructed using the embedding theorem. PCCC is tested on numerical simulations and applied to paleoclimate data characterized by irregular and uncertain sampling and limited numbers of samples.

for the autonomous or master system, and for the response or slave system. Parameters were set as: a 1 = a 2 = 0.15 , b 1 = b 2 = 0.2 , c 1 = c 2 = 10.0 , and frequencies set as: ω 1 = 1.015 and ω 2 = 0.985 . The coupling parameter, ε, was fixed to 0.09. The data were generated by numerical integration based on the adaptive Bulirsch-Stoer method 60 using a sampling interval of 0.314 for both the master and slave systems. This procedure gives 17-21 samples per period. One hundred realizations of these systems were simulated, and the initial 5000 transient samples were removed before the data were used in the experiments.
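For orientation, the simulation setup can be sketched as follows. This is a minimal sketch under stated assumptions, not the procedure used in the study: it assumes the commonly used form of unidirectionally coupled Rössler systems, with a diffusive coupling term ε(x 1 − x 2 ) entering the x 2 equation, and it replaces the adaptive Bulirsch-Stoer integrator with a fixed-step fourth-order Runge-Kutta scheme. The names `simulate_coupled_rossler`, `substeps` and `state0`, as well as the initial condition, are hypothetical.

```python
def rossler_pair(state, eps, w1=1.015, w2=0.985, a=0.15, b=0.2, c=10.0):
    """Derivative of (x1, y1, z1, x2, y2, z2); assumed standard coupled form."""
    x1, y1, z1, x2, y2, z2 = state
    return (
        -w1 * y1 - z1,                     # master (autonomous) system
        w1 * x1 + a * y1,
        b + z1 * (x1 - c),
        -w2 * y2 - z2 + eps * (x1 - x2),   # response system, driven through x1
        w2 * x2 + a * y2,
        b + z2 * (x2 - c),
    )


def rk4_step(state, dt, eps):
    """One fixed-step RK4 update (a stand-in for adaptive Bulirsch-Stoer)."""
    def shifted(k, h):
        return tuple(s + h * ki for s, ki in zip(state, k))

    k1 = rossler_pair(state, eps)
    k2 = rossler_pair(shifted(k1, dt / 2), eps)
    k3 = rossler_pair(shifted(k2, dt / 2), eps)
    k4 = rossler_pair(shifted(k3, dt), eps)
    return tuple(
        s + dt / 6 * (p + 2 * q + 2 * r + u)
        for s, p, q, r, u in zip(state, k1, k2, k3, k4)
    )


def simulate_coupled_rossler(n_samples, eps=0.09, sampling=0.314, substeps=20,
                             n_transient=5000,
                             state0=(0.1, 0.2, 0.3, 0.4, 0.5, 0.6)):
    """Return (x1, x2) sampled every `sampling` time units after the transient."""
    state = state0
    dt = sampling / substeps
    x1, x2 = [], []
    for i in range(n_transient + n_samples):
        for _ in range(substeps):
            state = rk4_step(state, dt, eps)
        if i >= n_transient:               # discard the initial transient samples
            x1.append(state[0])
            x2.append(state[3])
    return x1, x2
```

Calling `simulate_coupled_rossler(2048)` would give one realization of the driver and response observables at the length used in the noise and sparsity experiments; different realizations would need different initial conditions.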
As can be seen from the equations, there is a coupling between x 1 and x 2 , with x 1 influencing x 2 . The causal influence between the two systems was analysed using four causality estimation measures: bivariate or scalar CCC, CMI, PCCC and PCMI, for the cases outlined in the following paragraphs. The estimation procedure for each of the methods is described in the "Methods" section. The parameter values used for each of the methods are also given in the "Methods" section (Table 2).
Finite length data. The length of time series, N, of x 1 and x 2 taken from coupled Rössler systems was varied as shown in Fig. 1. CMI and PCMI were estimated up to longer lengths because CMI did not give optimal performance until the length reached 32,768 samples. Figure 1c shows scalar (simple bivariate) CMI, or one-dimensional CMI (CMI1), between x 1 and x 2 (see Paluš and Vejmelka 22 ). This method has high sensitivity but suffers from low specificity. This problem is solved by using conditional CMI, or three-dimensional CMI (CMI3), where the information from other variables ( y 1 , z 1 , y 2 , z 2 ) is incorporated in the estimation. Its performance is depicted in Fig. 1e. However, it requires longer time series for optimal performance. Figure 1a shows the performance of scalar (simple bivariate) CCC, which is equivalent to the CMI1 case in terms of dimensionality. Figure 1b,d show the performance of PCCC and PCMI respectively. For each length level, all 100 realizations of the coupled systems were considered, and 100 surrogates were generated for each realization in order to assess the significance of the causality estimated (in both directions) from that realization. These surrogates were generated for both processes using the Amplitude Adjusted Fourier Transform method 61 , and significance testing was done using a standard one-sided z-test at a significance level of 0.05 (this was justified as the surrogate distributions for the CCC and CMI methods implemented were found to be Gaussian). Based on this significance analysis, the true positive rate (TPR) and false positive rate (FPR) were computed at each length level. A true positive is counted for a particular realization of the coupled systems when the causality estimated from x 1 to x 2 is found to be significant, and a false positive is counted when the causality estimated from x 2 to x 1 is found to be significant.
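The significance analysis described above can be sketched as follows. This is an illustrative sketch only: it assumes that the one-sided test asks whether the observed value lies above the surrogate mean, and the function names `is_significant` and `tpr_fpr` are hypothetical. Surrogate generation itself is not shown.

```python
from statistics import NormalDist, mean, stdev


def is_significant(observed, surrogate_values, alpha=0.05):
    """One-sided z-test of `observed` against the surrogate distribution.

    Assumption: significance means the observed causality value is larger
    than expected under the (approximately Gaussian) surrogate distribution.
    """
    mu = mean(surrogate_values)
    sigma = stdev(surrogate_values)
    if sigma == 0:                         # degenerate surrogates: compare directly
        return observed > mu
    z = (observed - mu) / sigma
    p_value = 1.0 - NormalDist().cdf(z)    # upper-tail probability
    return p_value < alpha


def tpr_fpr(results):
    """Compute (TPR, FPR) from per-realization significance decisions.

    `results` holds one (sig_x1_to_x2, sig_x2_to_x1) pair per realization:
    a significant x1 -> x2 estimate counts as a true positive, and a
    significant x2 -> x1 estimate counts as a false positive.
    """
    results = list(results)
    tpr = sum(forward for forward, _ in results) / len(results)
    fpr = sum(backward for _, backward in results) / len(results)
    return tpr, fpr
```

With 100 realizations per length level, `results` would contain 100 such pairs, each obtained from 100 surrogates per direction.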
As can be seen from the plots, direct application of scalar CCC fails on multidimensional dynamical systems data, yielding low true positives and high false positives. Hence the method displays poor sensitivity as well as poor specificity. CMI1 also performs poorly, yielding high false positives. CMI3, which is appropriate for multi-dimensional data, only begins to perform well when the length of time series is greater than 32,768 samples. PCCC, on the other hand, begins to give high true positives and low false positives as the length of time series is increased to 1024 time points, with TPR and FPR reaching almost 1 and 0 respectively at 2048 time points. The use of permutation patterns also improves the performance of CMI3 for short data, as PCMI begins to show a TPR of 1 and an FPR of 0 at a length of 2048 time points.
We did further experiments with simulated Rössler data by varying the amount of noise and the number of missing samples in the data. For these cases, only the performance of PCCC and PCMI was evaluated, because the 'varying length' experiments show that scalar CCC and CMI1 do not work for multidimensional dynamical systems data and that CMI3 does not perform well for short data.
Noisy data. White Gaussian noise was added to the simulated Rössler data, with the amount of noise set relative to the standard deviation of the data. The noise standard deviation ( σ n ) is expressed as a percentage of the standard deviation of the original data ( σ s ). For example, 20% of noise means σ n = 0.2σ s , and 100% of noise means σ n = σ s . The length of time series for this experiment was fixed to 2048. For each realization of noisy data, 100 surrogate time series were again generated using the Amplitude Adjusted Fourier Transform method, and significance testing was performed as before using the z-test. Figure 2a,b show the results for varying noise in the data for PCCC and PCMI respectively.
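The noise model above amounts to scaling a zero-mean Gaussian by a fraction of the signal's standard deviation. A minimal sketch, assuming the population standard deviation is used for σ s (the text does not say which estimator was used) and using the hypothetical name `add_relative_noise`:

```python
import random
from statistics import pstdev


def add_relative_noise(series, noise_percent, seed=None):
    """Add white Gaussian noise with sigma_n = (noise_percent / 100) * sigma_s."""
    rng = random.Random(seed)
    sigma_s = pstdev(series)               # standard deviation of the clean data
    sigma_n = noise_percent / 100.0 * sigma_s
    return [value + rng.gauss(0.0, sigma_n) for value in series]
```

For example, `add_relative_noise(x2, 20)` corresponds to the "20% of noise" case, σ n = 0.2σ s .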
It can be seen that PCCC performs well for low levels of noise, up to 10%, but its performance begins to deteriorate at higher noise levels. PCMI, on the other hand, shows high TPR and low FPR even as the noise level is increased to 50%.
Sparse data. We refer to time series with missing samples as sparse data. Sparsity, or non-uniformly missing samples, was introduced in the data in two ways: (1) synchronous sparsity and (2) asynchronous sparsity. In case (1), samples were missing from both x 1 and x 2 at randomly chosen time indices, and this set of time indices was the same for both x 1 and x 2 . In case (2), samples were missing from both x 1 and x 2 based on two different sets of randomly chosen time indices, that is, the time indices of missing samples were different for x 1 and x 2 . The amount of synchronous/asynchronous sparsity is expressed as the percentage of missing samples relative to the original length of the time series. α sync and α async refer to the level of missing samples for synchronous and asynchronous sparsity respectively, and are given by m/N, where m is the number of missing samples and N is the original length of time series. N was fixed to 2048. The time series became shorter as the percentage of missing samples was increased. Causality estimation measures were applied to the data without any knowledge of whether any samples were missing or of the time stamps at which they were missing. Surrogate data for each realization were generated before, rather than after, the introduction of missing samples, using the original-length time series. Sparsity was then introduced in the surrogate time series in a manner similar to that for the original time series. Figure 2c,d show the results obtained using PCCC and PCMI respectively for synchronous sparsity. Figure 2e,f show the same for asynchronous sparsity. It can be seen that PCCC is robust to the introduction of missing samples, showing high TPR and low FPR. FPR exceeds 0.2 only when the level of synchronous sparsity is increased to 25% and that of asynchronous sparsity to 20%.
PCMI is robust to low levels of synchronous sparsity but deteriorates beyond 5% of missing samples, giving low true positives. It performs very poorly even with low levels of asynchronous sparsity.
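The two sparsity schemes can be sketched as follows. The sketch assumes that missing samples are simply deleted, so that the remaining samples are concatenated without any record of where the gaps were; the names `drop_samples` and `make_sparse` are hypothetical.

```python
import random


def drop_samples(series, missing_idx):
    """Remove the samples at `missing_idx`; the remaining samples are concatenated."""
    drop = set(missing_idx)
    return [value for i, value in enumerate(series) if i not in drop]


def make_sparse(x1, x2, alpha, synchronous=True, seed=None):
    """Delete a fraction `alpha` = m/N of samples from two equal-length series.

    synchronous=True  -> the same random time indices are removed from both;
    synchronous=False -> an independent random index set is drawn for each.
    """
    rng = random.Random(seed)
    n = len(x1)
    n_missing = round(alpha * n)
    idx1 = rng.sample(range(n), n_missing)
    idx2 = idx1 if synchronous else rng.sample(range(n), n_missing)
    return drop_samples(x1, idx1), drop_samples(x2, idx2)
```

With N = 2048 and a sparsity level of 25%, each returned series has 1536 samples, and the causality measures would then be applied to these shortened series as if they were regularly sampled.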

Real data analysis.
As discussed in the Introduction, a number of climate datasets are sampled at irregular intervals, have missing samples, are sampled after long intervals of time, or have a combination of two or more of these issues. In addition, the temporal recordings available are short. We apply the   63 . One data point for each of the CO 2 and temperature recordings was available for each million year period and was used in our analysis to check for causal interaction between the two.
CO 2 , CH 4 and temperature recordings over the last 800,000 years. The Past Interglacials Working Group of PAGES 64 has made available proxy records of atmospheric CO 2 , CH 4 and deepwater temperatures over the last 800 ka (1 ka = 1000 years). Each of these time series was reconstructed by a separate study, so the recordings available are non-synchronous and also irregularly sampled for each variable. Further, some data points are missing in the temperature time series. Roughly a single data point is available for each ka for each of the three variables. CO 2 proxy data are based on Antarctic ice core composites. These were first reported by Lüthi et al. 65 , and revised values were made available in a study by Bereiter et al. 66 . Reconstructed atmospheric CH 4 concentrations, also based on ice cores, were as reported by Loulergue et al. 67 (on the AICC2012 age scale 68 ). Deepwater temperature recordings obtained using shallow-infaunal benthic foraminifera (Mg/Ca ratios) that became available from Ocean Drilling Program (ODP) site 1123 on the Chatham Rise, east of New Zealand, were reported by Elderfield et al. 69 .
Causal influence was checked between CO 2 and temperature, and separately between CH 4 and temperature. CO 2 and CH 4 data are taken beginning from the 6.5th ka on the AICC2012 scale, and temperature data are taken beginning from the 7th ka. Since the number of data points available for temperature is 792, the CO 2 -temperature analysis was done based on these 792 samples. As the number of samples of CH 4 is limited to 756 beginning from the 6.5th ka, the CH 4 -temperature analysis was done using these 756 data points.
Monthly CO 2 -temperature dataset. Monthly mean CO 2 data constructed from mean daily CO 2 values, as well as the Northern Hemisphere's combined land and ocean temperature anomalies on the monthly timescale, are openly available on the National Oceanic and Atmospheric Administration (NOAA) website. The CO 2 measurements were made at the Mauna Loa Observatory, Hawaii. A part of the CO 2 dataset (March 1958-April 1974) was originally obtained by C. David Keeling of the Scripps Institution of Oceanography and is available on the Scripps website. NOAA began its own CO 2 measurements in May 1974. The temperature anomaly dataset is constructed from the Global Historical Climatology Network-Monthly data set 70 and the International Comprehensive Ocean-Atmosphere Data Set, also available on the NOAA website. These data from March 1958 to June 2021 (760 data points) were used to check for the causal influence between CO 2 and temperature on the recent timescale. Both time series were differenced using consecutive values as they were highly non-stationary.
73 . The All India monthly rainfall dataset from 1871 to 2016, available on the official website of the World Meteorological Organization and originally acquired from the 'Indian Institute of Tropical Meteorology', was used for analysis. These recordings are in units of mm/month. Causal influence was checked between these two recordings using 1752 data points, ranging from January 1871 to December 2016.
76 . The Central European 500 year temperature reconstruction dataset, beginning from 1500 AD, is made openly available by the NOAA National Centers for Environmental Information, under the World Data Service for Paleoclimatology. These were derived in the study 77 . We took winter-only data points (months December, January and February) from December 1658 to February 2001, as it is known that the NAO influence is strongest in winter. This yielded a total of 1029 data points.
However, reconstruction based on embedding was done for each year's winter separately (with a time delay of 1) and not in a continuous manner as for the other datasets, reducing the length of the ordinal-pattern-encoded sequence to 343. Causal influence was checked between NAO and temperature for the encoded sequences using PCMI and PCCC, and directly using one-dimensional CMI and CCC for the sequences of length 1029.
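The reduction from 1029 to 343 points follows from embedding each three-month winter on its own. A minimal sketch, assuming an embedding dimension of 3 (inferred from the reduction of three points per winter to one pattern; the dimension actually used is listed with the other parameters in Table 2) and using the hypothetical names `embed_segment` and `embed_by_segments`:

```python
def embed_segment(segment, m, delay=1):
    """Delay-embed one contiguous segment into m-dimensional vectors."""
    span = (m - 1) * delay
    return [
        tuple(segment[t + k * delay] for k in range(m))
        for t in range(len(segment) - span)
    ]


def embed_by_segments(segments, m, delay=1):
    """Embed each segment (e.g. one winter) separately and concatenate.

    No embedded vector mixes values from two different segments.
    """
    vectors = []
    for segment in segments:
        vectors.extend(embed_segment(segment, m, delay))
    return vectors
```

With 343 winters of three monthly values each, `embed_by_segments(winters, m=3)` yields exactly one vector per winter, that is, 343 vectors, whereas embedding the 1029 values continuously would give 1027 vectors, some of which would span two different winters.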
Daily NAO-temperature recordings. Daily NAO records are available on the NOAA website and have been published in Refs. [78][79][80] . Daily mean surface air temperature data from the Frankfurt station in Germany were taken from the records made available online by the ECA&D project 81 . These data were taken from 1st January 1950 to 31st April 2021. Once again, daily values from the winter months alone (December, January and February), comprising 6390 data points, were extracted for the analysis. While embedding the two time series, care was taken not to embed the recordings of one year's winter together with those of the next year's winter. Causal influence was checked between the daily winter NAO and temperature time series.
For the analysis of causal interaction in each of these datasets, scalar CCC and CMI as well as PCCC and PCMI were computed as discussed in the "Methods" section. Parameters used for each of the methods are also given in the "Methods" section (Table 2). In order to assess the significance of the causality value estimated using each measure, 100 surrogate realizations were generated using the stationary bootstrap method 82 for both the time series under consideration. In this method, surrogate time series are obtained by resampling blocks of observations of random length from the original time series. The length of each block has a geometric distribution. The probability parameter that determines this geometric distribution was set to 0.1 (as suggested in Ref. 82 ). Significance testing of the causal interaction between the original time series was then done using a standard one-sided z-test at a significance level of 0.05. Table 1 shows whether causal influence between the considered variables was found to be significant using each of the causality measures. Figure 3 depicts the value of PCCC between the original pair of time series with respect to the distribution of PCCC obtained using surrogate time series for two datasets: kilo-year scale CO 2 -temperature (Fig. 3a,b) and yearly scale ENSO-SASM (Fig. 3c,d) recordings. In the tables, in Fig. 3 and in the following text, we use the notation 'T' to refer to temperature generically. Which temperature recording is being referred to will be clear from context.
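The block resampling described above can be sketched as follows. This is a minimal sketch of a stationary bootstrap, assuming the usual formulation in which each block either continues with the next sample (wrapping around the end of the series) or, with probability p, a new block starts at a random position, which gives geometrically distributed block lengths with mean 1/p. The name `stationary_bootstrap` is hypothetical.

```python
import random


def stationary_bootstrap(series, p=0.1, seed=None):
    """Return one surrogate of the same length as `series`.

    Blocks start at uniformly random positions and have geometrically
    distributed lengths with mean 1/p; indices wrap around circularly.
    """
    rng = random.Random(seed)
    n = len(series)
    surrogate = []
    idx = rng.randrange(n)                 # start of the first block
    while len(surrogate) < n:
        surrogate.append(series[idx])
        if rng.random() < p:
            idx = rng.randrange(n)         # start a new block
        else:
            idx = (idx + 1) % n            # continue the current block
    return surrogate
```

With p = 0.1 the blocks are 10 samples long on average. Generating 100 such surrogates for each of the two series gives the reference distribution against which the observed causality value is tested.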

Discussion and conclusions
CCC has been proposed as an 'interventional' causality measure for time series. It does not require cause-effect separability in time series samples and is based on the dynamical evolution of processes. This makes it suitable for subsampled time series, for time series in which cause and effect are acquired at slightly different spatio-temporal scales than those at which they naturally occur, and even for cases with slight discrepancies between the spatio-temporal scales of the cause and effect time series. As a result, it performs robustly on data with missing samples and on non-uniformly sampled, decimated and short data 41 . In this work, we have proposed the use of CCC in combination with ordinal pattern encoding. The latter preserves the dynamics of the systems of observed variables, allowing CCC to decipher causal relationships between variables of multi-dimensional systems while conditioning on the presence of other variables in these systems which might be unknown or unobserved. Simulations of coupled Rössler systems illustrate that scalar CCC fails completely for observables of coupled multi-dimensional dynamical systems, while PCCC performs well in determining the correct direction of coupling. Comparison of PCCC with PCMI for these simulations shows that PCCC performs better on shorter time series. Further, while PCMI consistently gave superior performance for increasing noise in coupled Rössler systems, experiments with sparse data showed that PCCC outperforms PCMI, whether samples were missing from the driver and response time series in a synchronous or an asynchronous manner.
As PCCC showed promising results for simulations with high levels of missing samples and short length, we have applied it to make causal inferences in datasets from climatology and paleoclimatology which suffer from irregular sampling, missing samples and/or a limited number of available data points. Many of these datasets have been analyzed in previous studies. However, different studies report different results, probably due to the challenging nature of the available recordings or the limitations of the inference methods applied to them.
For example, the relationship between CO 2 concentrations and temperature of the atmosphere has been studied since the mid 1800s 83,84 , when a strong link between the two was first recognized. Relatively recently, with causal inference tools available, a number of studies have begun to look at the directionality of the relationship between the two on different temporal scales. To mention a few findings, Kodra et al. 85 found that CO 2 Granger causes temperature. Their analysis was based on data taken from 1860 to 2008. Attanasio 86 found clear evidence of GC from CO 2 to temperature using a lag-augmented Wald test, for a similar time range. On the other hand, Stern and Kaufmann 87 found bidirectional GC between the two, again for a similar time range. Kang and Larsson 88 also found bidirectional causation between the two using GC, but using data from ice cores for the last 800,000 years. Many of these latter studies criticize the former. Also, the drawbacks of one or more of these studies are explicitly mentioned in Refs. 87,89,90 , which highlight the issues with the data and/or the methodology employed. Other than GC and its extensions, a couple of other measures have also been used to study the CO 2 -T relationship. Stips et al. 91 applied a measure called Liang's information flow to CO 2 -T recordings on both recent (1850-2005) and paleoclimate (800 ka ice core reconstructions) time-scales. The study finds unidirectional causation CO 2 → T on the recent time-scale and T → CO 2 on the paleoclimatic scale. They also analysed the CH 4 -T relationship and found T to drive CH 4 on the paleoclimate scale. This study has been criticized by Goulet et al. 92 , who show that an assumption of 'linearity' made by Liang's information flow is nearly always rejected by the data.
Convergent cross mapping (CCM), applied to the 800 ka recordings in another study, finds a bidirectional causal influence for both CO 2 -T and CH 4 -T 93 . Another recent study, which infers causation using lagged cross-correlations between monthly CO 2 and temperature from the period 1980-2019, has found a bidirectional relationship on the recent monthly scale, with the dominant influence being T → CO 2 94 . In the light of the limitations of CCM 95,96 , especially for irregularly sampled or missing data 42 , and of the widely known pitfalls of the correlation coefficient 97 , it is difficult to rely on the inferences of the latter two studies. PCCC indicates unidirectional causality T → CO 2 on the paleoclimatic scale, using both millennial and kilo-year scale recordings. On the recent monthly scale, the situation is reversed, with CO 2 driving T. These results are in line with some of the existing CO 2 -T causal analysis studies, and PCCC does not suffer from the limitations of the existing approaches noted above. On the kilo-year scale, PCCC suggests that CH 4 drives T. While none of the causality studies discussed above has found this result, other works have suggested that methane concentrations modulate millennial-scale climate variability because of the sensitivity of methane to insolation 98,99 . The other approaches implemented in this study (CCC, CMI and PCMI) also do not reproduce the results obtained by PCCC because of their specific limitations, such as the inability to work on multi-dimensional, short or irregularly sampled data.
ENSO events and the Indian monsoon are other major climatic processes of global importance 59 . The relationship between the two has been studied extensively, especially using correlation and coherence approaches [100][101][102][103][104][105] .
While ENSO is normally expected to play a driving role, there is no clear consensus on the directionality of the relationship between the two processes. More recently, causal inference approaches have been used to study the nature of their coupling. In Refs. 106,107 , both linear and non-linear versions of GC were applied to monthly mean ENSO-Indian monsoon time series covering the period 1871-2006, and bidirectional coupling was inferred between the two processes. Other studies have examined the causal relationship indirectly by analyzing the link between ENSO and the Indian Ocean Dipole. For example, in Ref. 108 , this connection was studied by applying GC to yearly reanalysis as well as model data covering 1950-2014. The study found a robust causal influence of the Indian Ocean Dipole on ENSO, while the influence in the opposite direction had lower confidence. Using PCCC, we find a bidirectional causal influence between yearly recordings of ENSO and SASM. However, on the shorter monthly scale, NINO is found to drive the Indian Monsoon and there is no significant effect in the opposite direction.
Although the NAO is known to be a leading mode of winter climate variability over Europe [109][110][111] , the directionality or feedback in NAO-related climate effects has been examined by only a few causality analysis studies 9,112,113 . We investigate the relationship between NAO and European temperatures on both monthly and daily time scales using winter-only data. While PCCC indicates that NAO drives central European temperatures with no significant feedback on the longer monthly scale, on the daily scale it shows no significant causation in either direction. On the other hand, CCC and CMI, based on one-dimensional time series, indicate a strong influence from NAO to Frankfurt daily mean temperatures. This result indicates that the NAO influence on European winter temperature on the daily scale can be explained as a simple time-delayed transfer of information between scalar time series, in which no role is played by higher-dimensional patterns, potentially reflected in ordinal coding. Such an information transfer in the atmosphere is tied to the transfer of mass and energy, as indicated in the study of climate networks by Hlinka et al. 114 . CMI and PCMI estimates can be considered reliable for this analysis as the time series analyzed are long, close to 6000 time points.
CCC is free of the assumption of linearity and of the requirement of long-term stationarity, and is extremely robust to missing samples, irregular sampling and short data; its combination with permutation patterns allows it to make reliable inferences for coupled systems with multiple variables. Thus, we can expect the analysis and inferences presented here on some highly researched and long-debated climatic interactions to be highly robust and reliable. We also expect that the use of PCCC on other challenging datasets from climatology and other fields will help to shed light on the causal linkages in the systems considered.

Methods
Compression-complexity causality (CCC) is defined as the change in the dynamical compression-complexity of time series $y$ when $\Delta y$ is seen to be generated jointly by the dynamical evolution of both $y_{past}$ and $x_{past}$, as opposed to by the reality of the dynamical evolution of $y_{past}$ alone. $y_{past}$ and $x_{past}$ are windows of a particular length $L$ taken from contemporary time points of time series $y$ and $x$ respectively, and $\Delta y$ is a window of length $w$ following $y_{past}$ 41 . Dynamical compression-complexity (CC) is estimated using the measure effort-to-compress (ETC) 115 and given by:
$$CC(\Delta y \mid y_{past}) = ETC(y_{past} + \Delta y) - ETC(y_{past}), \tag{3}$$
$$CC(\Delta y \mid x_{past}, y_{past}) = ETC(y_{past} + \Delta y,\; x_{past} + \Delta y) - ETC(x_{past}, y_{past}). \tag{4}$$
Equation (3) computes the dynamical compression-complexity of $\Delta y$ as a dynamical evolution of $y_{past}$ alone. Equation (4) computes the dynamical compression-complexity of $\Delta y$ as a dynamical evolution of both $y_{past}$ and $x_{past}$. $CCC_{x_{past} \to \Delta y}$ is then estimated as:
$$CCC_{x_{past} \to \Delta y} = CC(\Delta y \mid y_{past}) - CC(\Delta y \mid x_{past}, y_{past}). \tag{5}$$
Averaged CCC from $x$ to $y$ over the entire length of time series, with the window $\Delta y$ being slid by a step-size of $\delta$, is estimated as:
$$CCC_{x \to y} = \overline{CCC}_{x_{past} \to \Delta y} = \overline{CC}(\Delta y \mid y_{past}) - \overline{CC}(\Delta y \mid x_{past}, y_{past}). \tag{6}$$
If $\overline{CC}(\Delta y \mid y_{past}) \approx \overline{CC}(\Delta y \mid x_{past}, y_{past})$, there is no causality from $x$ to $y$. Surrogate time series are generated for both $x$ and $y$, and the $CCC_{x \to y}$ values of the original and surrogate time series are compared. If the CCC computed for the original time series is statistically different from that of the surrogate time series, we can infer the presence of a causal relation $x \to y$ 42 . $CCC_{x \to y}$ can be either negative or positive, depending upon the nature or quality of the causal relationship 41 . The magnitude indicates the strength of causation.
Selection of parameters: $L$, $w$, $\delta$ and the number of bins, $B$, for symbolizing the time series using equidistant binning (ETC is applied to symbolic sequences) is done using the parameter selection criteria given in the supplementary text of Ref. 41 .
Permutation compression-complexity causality is the causal inference technique proposed and implemented in this work. Given a pair of time series $x_1$ and $x_2$ from dynamical systems in which causation is to be checked from $x_1$ to $x_2$, we first embed the time series of the potential driver ($x_1$ here) in the following manner:
$$\mathbf{x}_1(t) = \big(x_1(t),\; x_1(t + \eta),\; x_1(t + 2\eta),\; \ldots,\; x_1(t + (m - 1)\eta)\big), \tag{7}$$
where $\eta$ is the time delay and $m$ is the embedding dimension of $x_1$. $\eta$ is computed as the first minimum of the auto mutual information function. The embedded time series at each time point is then symbolized using permutation or ordinal patterns binning. For example, if $m = 3$, the embedding at time point $t$ is given as $\mathbf{x}_1(t) = (x_1(t), x_1(t + \eta), x_1(t + 2\eta))$. Symbols 0, 1, 2 are then used for labelling the pattern for $\mathbf{x}_1(t)$ at each time point by sorting the embedded values in ascending order, with 2 being used for the highest value and 0 for the lowest. If two or more values are exactly the same in $\mathbf{x}_1(t)$, they are labelled differently depending on the order of their occurrence, where the same value takes a smaller symbol at its first (or earlier) occurrence. However, this may lead to two or more different embedded vectors having the same ordinal representation. For example, the embeddings (3,5,5), (3,3,5) and (3,3,3) all have an ordinal representation of (0, 1, 2). This limits the total number of possible patterns at time $t$ to $m! = 3!$. Thus, $\mathbf{x}_1(t)$ is symbolized to a one-dimensional sequence consisting of $m!$ possible symbols or bins. CCC is then estimated from the symbolized $\mathbf{x}_1(t)$ to $x_2(t)$ using Eq. (6), after symbolizing $x_2(t)$ using standard equidistant binning with $m!$ bins.
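The steps above (ordinal symbolization of the driver, equidistant binning of the response, and ETC-based compression-complexity) can be sketched as follows. This is a simplified sketch under several assumptions rather than the implementation used in this work: ETC is left unnormalized, the most frequent pair is chosen from overlapping pair counts with ties broken by first occurrence, joint ETC of two sequences is computed by applying ETC to the sequence of simultaneous symbol pairs, and the response is aligned with the driver by truncation. All function names are hypothetical.

```python
from collections import Counter


def rank_pattern(window):
    """Ordinal pattern of one embedded vector; ties are ranked by order of occurrence."""
    order = sorted(range(len(window)), key=lambda i: (window[i], i))
    ranks = [0] * len(window)
    for rank, i in enumerate(order):
        ranks[i] = rank
    return tuple(ranks)


def ordinal_symbols(series, m, eta):
    """Symbolize (x(t), x(t+eta), ..., x(t+(m-1)eta)) at every valid t."""
    span = (m - 1) * eta
    return [rank_pattern(series[t:t + span + 1:eta])
            for t in range(len(series) - span)]


def equidistant_bins(series, n_bins):
    """Map values to bin labels 0..n_bins-1 of equal width."""
    lo, hi = min(series), max(series)
    width = (hi - lo) / n_bins or 1.0      # constant series: everything in bin 0
    return [min(int((v - lo) / width), n_bins - 1) for v in series]


def etc(symbols):
    """Unnormalized effort-to-compress: pair-substitution steps until constant."""
    seq = list(symbols)
    steps = 0
    while len(seq) > 1 and len(set(seq)) > 1:
        pairs = Counter(zip(seq, seq[1:]))
        target = max(pairs, key=pairs.get)  # most frequent adjacent pair
        new_symbol = ("pair", steps)        # fresh symbol for this step
        merged, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == target:
                merged.append(new_symbol)
                i += 2
            else:
                merged.append(seq[i])
                i += 1
        seq = merged
        steps += 1
    return steps


def joint_etc(a, b):
    """Joint ETC of two equal-length sequences (assumed: ETC of paired symbols)."""
    return etc(list(zip(a, b)))


def ccc_window(x_past, y_past, dy):
    """Eq. (5) for one window: CC(dy | y_past) - CC(dy | x_past, y_past)."""
    cc_self = etc(y_past + dy) - etc(y_past)
    cc_joint = joint_etc(y_past + dy, x_past + dy) - joint_etc(x_past, y_past)
    return cc_self - cc_joint


def pccc(driver_symbols, response_symbols, L, w, delta):
    """Eq. (6): average of the windowed values, sliding by delta."""
    n = min(len(driver_symbols), len(response_symbols))
    values = []
    for start in range(0, n - L - w + 1, delta):
        x_past = list(driver_symbols[start:start + L])
        y_past = list(response_symbols[start:start + L])
        dy = list(response_symbols[start + L:start + L + w])
        values.append(ccc_window(x_past, y_past, dy))
    return sum(values) / len(values)
```

For causation from x 1 to x 2 with m = 3, one would call `pccc(ordinal_symbols(x1, 3, eta), equidistant_bins(x2, 6), L, w, delta)`, with `L`, `w` and `delta` chosen by the parameter selection criteria referred to above. Note that `rank_pattern` reproduces the tie-handling example: (3,5,5), (3,3,5) and (3,3,3) all map to (0, 1, 2).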
Thus, permutation binning is not done for the potential response series, as it was found from simulation experiments (Rössler data) that embedding the 'cause' alone works better for the CCC measure. Full dimensionality of the cause is necessary to predict the effect. Hence, embedding only the cause helps to recover the causal relationship. PCCC helps to take into account the multidimensional nature of the coupled systems. Parameter selection for PCCC is done in the same manner as for CCC, using the symbolic sequences of $\mathbf{x}_1(t)$ and $x_2(t)$ for selection of the parameters. When PCCC is to be estimated from $x_2 \to x_1$, $x_2$ is embedded and $x_1$ remains as it is. Just like CCC, the PCCC measure can also take negative values.
Conditional mutual information (CMI) of the variables $X$ and $Y$ given the variable $Z$ is a common information-theoretic functional used for causality detection, and can be obtained as
$$I(X; Y \mid Z) = H(X \mid Z) + H(Y \mid Z) - H(X, Y \mid Z), \tag{8}$$
where $H(X_1, X_2, \ldots \mid Z) = H(X_1, X_2, \ldots, Z) - H(Z)$ is the conditional entropy, and the joint Shannon entropy $H(X_1, X_2, \ldots)$ is defined as:
$$H(X_1, X_2, \ldots) = -\sum_{x_1, x_2, \ldots} p(x_1, x_2, \ldots) \log p(x_1, x_2, \ldots), \tag{9}$$
where $p(x_1, x_2, \ldots) = \Pr[X_1 = x_1, X_2 = x_2, \ldots]$ is the joint probability distribution function of the amplitude of variables $\{X_1, X_2, \ldots\}$. In order to detect the coupling direction between two dynamical variables $X$ and $Y$, Paluš et al. 21 used the conditional mutual information $I(X(t); Y(t + \tau) \mid Y(t))$, which captures the net information about the $\tau$-future of the process $Y$ contained in the process $X$. As mentioned in the Introduction, to estimate other unknown variables, an $m$-dimensional state vector $\mathbf{X}$ can be reconstructed as $\mathbf{X}(t) = \{x(t), x(t - \eta), \ldots, x(t - (m - 1)\eta)\}$. Accordingly, the CMI defined above can be represented by its reconstructed version for all variables of $\mathbf{X}(t)$, $\mathbf{Y}(t + \tau)$ and $\mathbf{Y}(t)$. However, extensive numerical studies 22 demonstrated that CMI in the form
$$I\big(x(t);\; y(t + \tau) \mid y(t), y(t - \eta), \ldots, y(t - (m - 1)\eta)\big) \tag{10}$$
is sufficient to infer the direction of coupling among the dynamical variables of $\mathbf{X}(t)$ and $\mathbf{Y}(t)$.
Accordingly, we use this measure to detect causal relationships in this article.
Permutation conditional mutual information (PCMI) can be obtained based on the permutation analysis described earlier in the PCCC definition. In this approach, all marginal, joint or conditional probability distribution functions of the amplitude of the variables are replaced by their symbolized versions; thus Eq. (9) should be replaced by
$$H(\hat{X}_1, \hat{X}_2, \ldots) = -\sum_{\hat{x}_1, \hat{x}_2, \ldots} p(\hat{x}_1, \hat{x}_2, \ldots) \log p(\hat{x}_1, \hat{x}_2, \ldots), \tag{11}$$
where $p(\hat{x}_1, \hat{x}_2, \ldots) = \Pr[\hat{X}_1 = \hat{x}_1, \hat{X}_2 = \hat{x}_2, \ldots]$ is the joint probability distribution function of the symbolized variables $\hat{X}_i(t) = \{X_i(t), X_i(t + \eta), \ldots, X_i(t + (m - 1)\eta)\}$. By using Eqs. (8) and (11), permutation CMI can be obtained as $I(\hat{X}(t); \hat{Y}(t + \tau) \mid \hat{Y}(t))$. Finally, one should replace $\tau$ with $\tau + (m - 1)\eta$ in order to avoid any overlap between the past and future of the symbolized variable $\hat{Y}$.
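For symbolized sequences, the permutation CMI above can be estimated by replacing probabilities with relative frequencies. The following is a minimal plug-in sketch, assuming natural logarithms and no bias correction; the names `entropy`, `cmi` and `pcmi` are hypothetical, and the inputs are assumed to be sequences that have already been symbolized with ordinal patterns. It uses the identity I(X; Y | Z) = H(X, Z) + H(Y, Z) − H(X, Y, Z) − H(Z), which follows from Eq. (8) and the definition of conditional entropy.

```python
from collections import Counter
from math import log


def entropy(samples):
    """Plug-in Shannon entropy (in nats) of a sequence of hashable symbols."""
    counts = Counter(samples)
    n = len(samples)
    return -sum(c / n * log(c / n) for c in counts.values())


def cmi(x, y, z):
    """I(X; Y | Z) = H(X, Z) + H(Y, Z) - H(X, Y, Z) - H(Z)."""
    xz = list(zip(x, z))
    yz = list(zip(y, z))
    xyz = list(zip(x, y, z))
    return entropy(xz) + entropy(yz) - entropy(xyz) - entropy(list(z))


def pcmi(x_symbols, y_symbols, tau, m, eta):
    """Estimate I(X^(t); Y^(t + lag) | Y^(t)) with lag = tau + (m - 1) * eta.

    The enlarged lag keeps the ordinal patterns of the past and of the
    future of Y from sharing any underlying samples.
    """
    lag = tau + (m - 1) * eta
    n = min(len(x_symbols), len(y_symbols)) - lag
    return cmi(x_symbols[:n], y_symbols[lag:lag + n], y_symbols[:n])
```

The estimate is zero when the future of Y carries no information about X beyond what the present of Y already provides; in practice its significance would be assessed against surrogates as described in the main text.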
Parameters of the methods used were set as shown in Table 2 for different datasets.